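Federated learning has grown rapidly within machine learning in recent years, giving rise to a variety of research topics. Popular optimization algorithms are based on the framework of (stochastic) gradient descent methods or the alternating direction method of multipliers. In this paper, we deploy an exact penalty method to deal with federated learning and propose an algorithm, FedEPM, that is able to tackle four critical issues in federated learning: communication efficiency, computational complexity, the stragglers' effect, and data privacy. Moreover, it is proven to be convergent and shown to have high numerical performance.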
translated by 谷歌翻译
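The step function is one of the simplest and most natural activation functions for deep neural networks (DNNs). Because it takes the value 1 for positive variables and 0 for all others, its intrinsic characteristics (e.g., discontinuity and the absence of usable subgradient information) have impeded its development for several decades. Although there is an impressive body of work on designing DNNs with continuous activation functions that can be regarded as surrogates of the step function, the step function still possesses some advantageous properties, such as complete robustness to outliers and the ability to attain the best learning-theoretic guarantee of predictive accuracy. Hence, in this paper, we aim to train DNNs that use the step function as the activation function (dubbed 0/1 DNNs). We first reformulate 0/1 DNNs as an unconstrained optimization problem and then solve it with a block coordinate descent (BCD) method. Moreover, we derive closed-form solutions for the sub-problems of BCD and establish its convergence properties. Furthermore, we integrate $\ell_{2,0}$-regularization into 0/1 DNNs to accelerate the training process and compress the network scale. As a result, the proposed algorithm achieves high performance in classifying the MNIST and Fashion-MNIST datasets.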
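As a small illustration of the activation described above, the following sketch implements the 0/1 step function and a forward pass through a network that uses it in every layer. It is only a minimal sketch: the function names `step` and `forward`, the list-of-rows weight layout, and the use of the step function in the output layer are assumptions made for this example, and the BCD training procedure from the abstract is not shown.

```python
def step(x):
    """0/1 step activation: 1 for strictly positive inputs, 0 otherwise."""
    return 1 if x > 0 else 0


def forward(weights, biases, inputs):
    """Forward pass through fully connected layers with step activations.

    weights[k] is a list of rows (one row per output unit of layer k) and
    biases[k] is the matching list of offsets. This layout is an assumption
    for illustration only.
    """
    activations = list(inputs)
    for layer_weights, layer_biases in zip(weights, biases):
        activations = [
            step(sum(w * a for w, a in zip(row, activations)) + bias)
            for row, bias in zip(layer_weights, layer_biases)
        ]
    return activations


# Hand-picked weights (not trained) that realize XOR with two hidden units:
# the first hidden unit acts as OR, the second as AND.
xor_weights = [[[1.0, 1.0], [1.0, 1.0]], [[1.0, -2.0]]]
xor_biases = [[-0.5, -1.5], [-0.5]]
```

Because the activation is piecewise constant, its derivative is zero wherever it exists, which is why gradient-based training gives no useful signal here and a different optimization scheme is needed.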
Federated learning has advanced considerably in recent years but still faces many challenges, such as how algorithms can save communication resources and reduce computational costs, and whether they converge. To address these critical issues, we propose a hybrid federated learning algorithm (FedGiA) that combines gradient descent and the inexact alternating direction method of multipliers. The proposed algorithm is more communication- and computation-efficient than several state-of-the-art algorithms, both theoretically and numerically. Moreover, it converges globally under mild conditions.
One of the crucial issues in federated learning is how to develop efficient optimization algorithms. Most existing algorithms require full device participation and/or impose strong assumptions for convergence. In contrast to the widely used gradient descent-based algorithms, in this paper we develop an inexact alternating direction method of multipliers (ADMM) that is both computation- and communication-efficient, capable of combating the stragglers' effect, and convergent under mild conditions. Furthermore, it shows high numerical performance compared with several state-of-the-art algorithms for federated learning.
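To make the ADMM setting more concrete, here is a minimal sketch of textbook consensus ADMM on a toy problem. It is not the inexact algorithm proposed in the abstract: the local sub-problems are solved exactly, every client participates in every round, and the scalar least-squares losses, the function name `consensus_admm`, and the parameters `rho` and `num_rounds` are all assumptions chosen for illustration.

```python
def consensus_admm(a, b, rho=1.0, num_rounds=300):
    """Consensus ADMM (scaled form) for min_x sum_i 0.5 * (a[i] * x - b[i])**2.

    Client i keeps a local copy x_i and a scaled dual variable u_i; the
    server keeps the global variable z. Only x_i + u_i is sent to the server.
    """
    n = len(a)
    z = 0.0
    u = [0.0] * n
    for _ in range(num_rounds):
        # Local step: each client minimizes its own loss plus the proximal
        # term (rho / 2) * (x - z + u_i)**2, which has a closed form here.
        x = [
            (a_i * b_i + rho * (z - u_i)) / (a_i * a_i + rho)
            for a_i, b_i, u_i in zip(a, b, u)
        ]
        # Server step: average the clients' messages.
        z = sum(x_i + u_i for x_i, u_i in zip(x, u)) / n
        # Dual step: accumulate each client's disagreement with z.
        u = [u_i + x_i - z for x_i, u_i in zip(x, u)]
    return z


# Three clients with private (a_i, b_i) pairs.
estimate = consensus_admm([1.0, 2.0, 3.0], [2.0, 1.0, 4.0])
```

For this toy problem the minimizer is sum(a_i * b_i) / sum(a_i ** 2), so `estimate` can be checked against that closed form. An inexact variant would replace the exact local solve with a cheaper approximate update.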
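One of the fundamental limitations of deep neural networks (DNNs) is their inability to acquire and accumulate new cognitive capabilities. When new data appear, such as new object classes that are not in the prescribed set of objects to be recognized, a conventional DNN cannot recognize them because of the fundamental formulation it relies on. The current solution is typically to redesign and relearn the entire network, perhaps with a new configuration, from a newly expanded dataset to accommodate the new knowledge. This process is quite different from that of a human learner. In this paper, we propose a new learning method, named Accretionary Learning (AL), to emulate human learning, in that the set of objects to be recognized need not be specified in advance. The corresponding learning structure is modular and can expand dynamically to register and use new knowledge. During accretionary learning, the learning process does not require the system to be completely redesigned and retrained as the set of objects grows in size. The proposed DNN structure does not forget previous knowledge when learning to recognize new data classes. We show that the new structure and design methodology lead to a system that can grow to cope with increased cognitive complexity while providing stable and superior overall performance.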
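Super-resolution algorithms based on deep neural networks (DNNs) have greatly improved the quality of the generated images. However, these algorithms often produce significant artifacts when dealing with real-world super-resolution problems, owing to the difficulty of learning from misaligned optical zoom. In this paper, we introduce a Squared Deformable Alignment Network (SDAN) to address this issue. Our network learns squared per-point offsets for convolutional kernels and then aligns features in the corrected convolutional windows based on those offsets. The misalignment is thus minimized by the extracted aligned features. Unlike the per-point offsets used in the vanilla Deformable Convolutional Network (DCN), our proposed squared offsets not only accelerate offset learning but also improve generation quality with fewer parameters. In addition, we propose an efficient cross-packing attention layer to improve the accuracy of the learned offsets. It leverages packing and unpacking operations to enlarge the receptive field of offset learning and to enhance the ability to extract the spatial connection between the low-resolution images and the reference images. Comprehensive experiments show the superiority of our method over other state-of-the-art methods in both computational efficiency and realistic details.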
Accurate determination of the binding pose of a small-molecule candidate (ligand) in its target protein pocket is important for computer-aided drug discovery. Typical rigid-body docking methods ignore the flexibility of the protein pocket, while the more accurate pose generation using molecular dynamics is hindered by slow protein dynamics. We develop a tiered tensor transform (3T) algorithm to rapidly generate diverse protein-ligand complex conformations for both pose and affinity estimation in drug screening. It requires neither machine learning training nor lengthy dynamics computation, while maintaining both coarse-grain-like coordinated protein dynamics and atomistic-level details of the complex pocket. The 3T conformation structures we generate are closer to experimental co-crystal structures than those generated by docking software and, more importantly, achieve significantly higher accuracy in active ligand classification than traditional ensemble docking using hundreds of experimental protein conformations. The 3T structure transformation is decoupled from the system physics, making future use in other computational scientific domains possible.
For Prognostics and Health Management (PHM) of lithium-ion (Li-ion) batteries, many models have been established to characterize their degradation process. The existing empirical or physical models can reveal important information regarding the degradation dynamics. However, there are no general and flexible methods to fuse the information represented by those models. The Physics-Informed Neural Network (PINN) is an efficient tool for fusing empirical or physical dynamic models with data-driven models. To take full advantage of various information sources, we propose a model fusion scheme based on PINN. It is implemented by developing a semi-empirical semi-physical Partial Differential Equation (PDE) to model the degradation dynamics of Li-ion batteries. When there is little prior knowledge about the dynamics, we leverage the data-driven Deep Hidden Physics Model (DeepHPM) to discover the underlying governing dynamic models. The uncovered dynamics information is then fused with that mined by the surrogate neural network in the PINN framework. Moreover, an uncertainty-based adaptive weighting method is employed to balance the multiple learning tasks when training the PINN. The proposed methods are verified on a public dataset of lithium iron phosphate (LFP)/graphite batteries.
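The abstract does not spell out the weighting rule, so the following is only a sketch of one common uncertainty-based scheme, in which each task loss is scaled by a learned homoscedastic variance and a log-variance term discourages ignoring a task. Whether the paper uses exactly this form is an assumption, and the function name `uncertainty_weighted_loss` and its arguments are hypothetical.

```python
import math


def uncertainty_weighted_loss(task_losses, log_variances):
    """Combine task losses as sum_i 0.5 * exp(-s_i) * L_i + 0.5 * s_i.

    s_i = log(sigma_i ** 2) is a trainable scalar per task. A large s_i
    down-weights task i, while the 0.5 * s_i term penalizes growing it.
    """
    return sum(
        0.5 * math.exp(-s) * loss + 0.5 * s
        for loss, s in zip(task_losses, log_variances)
    )


# Example with two tasks, e.g. a data-fitting loss and a PDE-residual loss
# (the numbers are made up for illustration).
total = uncertainty_weighted_loss([2.0, 4.0], [0.0, 0.0])
```

For a fixed task loss L_i, this expression is minimized at s_i = log(L_i), so tasks with larger losses receive smaller weights. In practice the s_i would be optimized jointly with the network parameters by an automatic-differentiation framework.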
Non-line-of-sight (NLOS) imaging aims to reconstruct three-dimensional hidden scenes from data measured in the line of sight, using photon time-of-flight information encoded in light after multiple diffuse reflections. Under-sampled scanning data can facilitate fast imaging. However, the resulting reconstruction problem becomes a severely ill-posed inverse problem whose solution is highly likely to be degraded by noise and distortion. In this paper, we propose two novel NLOS reconstruction models based on curvature regularization, i.e., the object-domain curvature regularization model and the dual (i.e., signal and object)-domain curvature regularization model. Fast numerical optimization algorithms are developed based on the alternating direction method of multipliers (ADMM) with a backtracking stepsize rule, and are further accelerated by GPU implementation. We evaluate the proposed algorithms on both synthetic and real datasets, where they achieve state-of-the-art performance, especially in the compressed sensing setting. All our code and data are available at https://github.com/Duanlab123/CurvNLOS.
Masked image modeling (MIM) has shown great promise for self-supervised learning (SSL) yet has been criticized for learning inefficiency. We believe the insufficient utilization of training signals is responsible. To alleviate this issue, we introduce a conceptually simple yet learning-efficient MIM training scheme, termed Disjoint Masking with Joint Distillation (DMJD). For disjoint masking (DM), we sequentially sample multiple masked views per image in a mini-batch under a disjoint regulation, which raises the usage of tokens for reconstruction in each image while keeping the masking rate of each view. For joint distillation (JD), we adopt a dual-branch architecture to predict invisible (masked) and visible (unmasked) tokens, respectively, with superior learning targets. Rooted in orthogonal perspectives on improving training efficiency, DM and JD cooperatively accelerate training convergence without sacrificing the model's generalization ability. Concretely, DM can train ViT with half of the effective training epochs (3.7 times less time-consuming) and still report competitive performance. With JD, our DMJD clearly improves the linear probing classification accuracy over ConvMAE by 5.8%. On fine-grained downstream tasks such as semantic segmentation and object detection, our DMJD also shows superior generalization compared with state-of-the-art SSL methods. The code and model will be made public at https://github.com/mx-mark/DMJD.